Controlling Bidirectional Parsing

Author

  • Fabio Ciravegna
Abstract

Traditional models of parsing as used in interfaces have proven weak and ineffective in complex tasks such as the processing of naturally occurring texts. Broad-coverage parsers go mad when confronted with extended inputs without sufficient information to control the interpretation process. The use of effective control strategies is necessary to overcome these shortcomings. Extra-linguistic criteria (e.g. statistics, semantic or goal-driven heuristics) can be employed to reduce the combinatorics of parsing and to avoid misdirected efforts, focusing the analysis on the most promising solutions. In this paper we present an approach to text parsing where an agenda-based bidirectional chart parser is coupled with a set of extra-linguistic control strategies. Those strategies affect the agenda management, favoring some tasks and delaying or pruning others. A preprocessing phase analyzes the text in order to collect information for the parsing control: text segmentation produces hints on the superficial structure of the text, and statistical classification provides the templates to fill. Afterwards the linguistic analyzer processes the text in two steps: segment parsing and segment combination. The control strategies are mainly applied during the latter phase. Currently implemented strategies integrate both general and domain-specific criteria such as grammar rule scores and goal-driven (i.e. template-based) information. The two-step linguistic analyzer is able to cope gracefully with the linguistic complexity of the input, pursuing a complete analysis only when it is feasible; bidirectionality makes it possible to maximize the coverage on chunks of input. Some preliminary results seem to show an improvement in parsing efficiency.
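
The abstract says the control strategies act on the agenda by favoring some tasks and delaying or pruning others, using criteria such as grammar rule scores and template-based goals. The following is a minimal sketch of that idea only, not the paper's implementation: the abstract does not say how scores are combined, so the additive goal bonus, the two thresholds, and every name here (`Task`, `ControlledAgenda`, `goal_labels`, `prune_below`, `delay_below`, `goal_bonus`) are illustrative assumptions.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Task:
    """A pending parser task: a candidate edge over a span of the input."""
    label: str         # hypothetical category label, e.g. "NP"
    start: int         # span start position
    end: int           # span end position
    rule_score: float  # assumed grammar rule score in [0, 1]


class ControlledAgenda:
    """Agenda that favors, delays, or prunes tasks by score (illustrative)."""

    def __init__(self, goal_labels, prune_below=0.2, delay_below=0.5,
                 goal_bonus=0.3):
        self.goal_labels = set(goal_labels)  # labels the templates ask for
        self.prune_below = prune_below       # assumed pruning threshold
        self.delay_below = delay_below       # assumed delaying threshold
        self.goal_bonus = goal_bonus         # assumed goal-driven bonus
        self._active = []                    # max-heap via negated scores
        self._delayed = []                   # used only when _active is empty
        self._tie = count()                  # keeps insertion order on ties
        self.pruned = []                     # kept for inspection

    def score(self, task):
        """Combine the rule score with a goal-driven bonus (an assumption)."""
        bonus = self.goal_bonus if task.label in self.goal_labels else 0.0
        return min(1.0, task.rule_score + bonus)

    def push(self, task):
        s = self.score(task)
        if s < self.prune_below:
            self.pruned.append(task)         # prune: never scheduled
            return
        queue = self._delayed if s < self.delay_below else self._active
        heapq.heappush(queue, (-s, next(self._tie), task))

    def pop(self):
        """Return the best active task, falling back to delayed ones."""
        queue = self._active or self._delayed
        if not queue:
            return None
        return heapq.heappop(queue)[2]


if __name__ == "__main__":
    agenda = ControlledAgenda(goal_labels={"EVENT"})
    agenda.push(Task("NP", 0, 2, 0.9))
    agenda.push(Task("PP", 2, 4, 0.3))
    agenda.push(Task("ADV", 4, 5, 0.1))
    agenda.push(Task("EVENT", 0, 5, 0.4))
    while (task := agenda.pop()) is not None:
        print(task.label, task.start, task.end)
```

In this sketch a task matching a template goal can overtake a task with a higher raw rule score, low-scoring tasks wait until nothing better remains, and tasks under the pruning threshold are never scheduled. A real chart parser would also add completed edges to the chart and generate new tasks from them, which is omitted here.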

Similar resources

Potsdam: Semantic Dependency Parsing by Bidirectional Graph-Tree Transformations and Syntactic Parsing

We present the Potsdam systems that participated in the semantic dependency parsing shared task of SemEval 2014. They are based on linguistically motivated bidirectional transformations between graphs and trees and on utilization of syntactic dependency parsing. They were entered in both the closed track and the open track of the challenge, recording a peak average labeled F1 score of 78.60.

LTAG Dependency Parsing with Bidirectional Incremental Construction

In this paper, we first introduce a new architecture for parsing, bidirectional incremental parsing. We propose a novel algorithm for incremental construction, which can be applied to many structure learning problems in NLP. We apply this algorithm to LTAG dependency parsing, and achieve significant improvement on accuracy over the previous best result on the same data set.

Bidirectional Automata for Tree Adjoining Grammars

We define a new model of automata for the description of bidirectional parsing strategies for tree adjoining grammars and a tabulation mechanism that allows them to be executed in polynomial time. This new model of automata provides a modular way of describing bidirectional parsing strategies for TAG, separating the description of a strategy from its execution.

Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations

We present a simple and effective scheme for dependency parsing which is based on bidirectional-LSTMs (BiLSTMs). Each sentence token is associated with a BiLSTM vector representing the token in its sentential context, and feature vectors are constructed by concatenating a few BiLSTM vectors. The BiLSTM is trained jointly with the parser objective, resulting in very effective feature extractors ...

Bidirectional Dependency Parser for Hindi, Telugu and Bangla

This paper describes the dependency parser we used in the NLP Tools Contest, 2009 for parsing Hindi, Bangla and Telugu. The parser uses a bidirectional parsing algorithm with two operations proj and non-proj to build the dependency tree. The parser obtained Labeled Attachment Score of 71.63%, 59.86% and 67.74% for Hindi, Telugu and Bangla respectively on the treebank with fine-grained dependenc...

Publication date: 1995